{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c024bfa4-1a7a-4751-b5a1-827225a3478b",
   "metadata": {
    "id": "c024bfa4-1a7a-4751-b5a1-827225a3478b"
   },
   "source": [
    "**LLM Workshop 2024 by Sebastian Raschka**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58b8c870-fb72-490e-8916-d8129bd5d1ff",
   "metadata": {
    "id": "58b8c870-fb72-490e-8916-d8129bd5d1ff"
   },
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# 6) Instruction finetuning (part 2; finetuning)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "013b3a3f-f300-4994-b704-624bdcd6371b",
   "metadata": {},
   "source": [
    "- In this notebook, we get to the actual finetuning part\n",
    "- But first, let's briefly introduce a technique, called LoRA, that makes the finetuning more efficient\n",
    "- It's not required to use LoRA, but it can result in noticeable memory savings while still resulting in good modeling performance"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21532056-0ef4-4c98-82c7-e91f61c6485e",
   "metadata": {
    "id": "21532056-0ef4-4c98-82c7-e91f61c6485e"
   },
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# 6.1 Introduction to LoRA"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66edc999-3d91-4a1c-a157-9d056392e8d8",
   "metadata": {
    "id": "66edc999-3d91-4a1c-a157-9d056392e8d8"
   },
   "source": [
    "- Low-rank adaptation (LoRA) is a machine learning technique that modifies a pretrained model to better suit a specific, often smaller, dataset by adjusting only a small, low-rank subset of the model's parameters\n",
    "- This approach is important because it allows for efficient finetuning of large models on task-specific data, significantly reducing the computational cost and time required for finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5bb75b5d-d59c-4948-821a-1594a5883dc1",
   "metadata": {
    "id": "5bb75b5d-d59c-4948-821a-1594a5883dc1"
   },
   "source": [
    "- Suppose we have a large weight matrix $W$ for a given layer\n",
    "- During backpropagation, we learn a $\\Delta W$ matrix, which contains information on how much we want to update the original weights to minimize the loss function during training\n",
    "- In regular training and finetuning, the weight update is defined as follows:\n",
    "\n",
    "$$W_{\\text{updated}} = W + \\Delta W$$\n",
    "\n",
    "- The LoRA method proposed by [Hu et al.](https://arxiv.org/abs/2106.09685) offers a more efficient alternative to computing the weight updates $\\Delta W$ by learning an approximation of it, $\\Delta W \\approx AB$.\n",
    "- In other words, in LoRA, we have the following, where $A$ and $B$ are two small weight matrices:\n",
    "\n",
    "$$W_{\\text{updated}} = W + AB$$\n",
    "\n",
    "- The figure below illustrates these formulas for full finetuning and LoRA side by side"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8a7419d-cae9-4525-bb44-1641f6ef4f3b",
   "metadata": {
    "id": "a8a7419d-cae9-4525-bb44-1641f6ef4f3b"
   },
   "source": [
    "<img src=\"figures/08.png\" width=\"1100px\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4edd43c9-8ec5-48e6-b3fc-5fb3c16037cc",
   "metadata": {
    "id": "4edd43c9-8ec5-48e6-b3fc-5fb3c16037cc"
   },
   "source": [
    "- If you paid close attention, the full finetuning and LoRA depictions in the figure above look slightly different from the formulas I have shown earlier\n",
    "- That's due to the distributive law of matrix multiplication: we don't have to add the weights with the updated weights but can keep them separate\n",
    "- For instance, if $x$ is the input data, then we can write the following for regular finetuning:\n",
    "\n",
    "$$x (W+\\Delta W) = x W + x \\Delta W$$\n",
    "\n",
    "- Similarly, we can write the following for LoRA:\n",
    "\n",
    "$$x (W+A B) = x W + x A B$$\n",
    "\n",
    "- The fact that we can keep the LoRA weight matrices separate makes LoRA especially attractive\n",
    "- In practice, this means that we don't have to modify the weights of the pretrained model at all, as we can apply the LoRA matrices on the fly\n",
    "- After setting up the dataset and loading the model, we will implement LoRA in the code to make these concepts less abstract"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b214882e-68d5-46e4-a2a0-bae3e8500a9e",
   "metadata": {},
   "source": [
    "<img src=\"figures/09.png\" width=\"800px\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78e4d859-440f-4438-9cea-54311a5fc13a",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# 6.2 Creating training and test sets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10333f8b-64dc-40f7-9508-657183384231",
   "metadata": {},
   "source": [
    "- There's one more thing before we can start finetuning: creating the training and test subsets\n",
    "- We will use 85% of the data for training and the remaining 15% for testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6cabbaac-c690-448a-9cfb-33418fcfc27b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "\n",
    "file_path = \"LLM-workshop-2024/06_finetuning/instruction-data.json\"\n",
    "\n",
    "with open(file_path, \"r\") as file:\n",
    "    data = json.load(file)\n",
    "print(\"Number of entries:\", len(data))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "293281ab-5c7e-444c-bc13-b63a65be68f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "train_portion = int(len(data) * 0.85)  # 85% for training\n",
    "test_portion = int(len(data) * 0.15)    # 15% for testing\n",
    "\n",
    "train_data = data[:train_portion]\n",
    "test_data = data[train_portion:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e1027bb-0b3f-4e5f-863c-be5d91ef49ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Training set length:\", len(train_data))\n",
    "print(\"Test set length:\", len(test_data))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad75b683-856a-425c-ba8f-c887a937baa0",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(\"train.json\", \"w\") as json_file:\n",
    "    json.dump(train_data, json_file, indent=4)\n",
    "    \n",
    "with open(\"test.json\", \"w\") as json_file:\n",
    "    json.dump(test_data, json_file, indent=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46318684-d607-4f40-aa8a-cd2b17f879c6",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# 6.3 Instruction finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95092e6d-cfe0-45e1-a8d6-aa5f72adaa32",
   "metadata": {},
   "source": [
    "- Using LitGPT, we can finetune the model via `litgpt finetune model_dir`\n",
    "- However, here, we will use LoRA finetuning `litgpt finetune_lora model_dir` since it will be quicker and less resource intensive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "415041a4-5533-49ed-bcc8-3e1cb249bc65",
   "metadata": {},
   "outputs": [],
   "source": [
    "!litgpt finetune_lora microsoft/phi-2 \\\n",
    "--data JSON \\\n",
    "--data.val_split_fraction 0.1 \\\n",
    "--data.json_path train.json \\\n",
    "--train.epochs 3 \\\n",
    "--train.log_interval 100"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4edc913",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# Exercise 1: Generate and save the test set model responses of the base model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c0324437-6d2f-4f59-9e78-39dd4d890a98",
   "metadata": {},
   "source": [
    "- In this excercise, we are collecting the model responses on the test dataset so that we can evaluate them later\n",
    "\n",
    "\n",
    "- Starting with the original model before finetuning, load the model using the LitGPT Python API (`LLM.load` ...)\n",
    "- Then use the `LLM.generate` function to generate the responses for the test data\n",
    "- The following utility function will help you to format the test set entries as input text for the LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "af804ccf-2fd5-4d16-ae9a-15274f405b60",
   "metadata": {},
   "outputs": [],
   "source": [
    "def format_input(entry):\n",
    "    instruction_text = (\n",
    "        f\"Below is an instruction that describes a task. \"\n",
    "        f\"Write a response that appropriately completes the request.\"\n",
    "        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
    "    )\n",
    "\n",
    "    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
    "\n",
    "    return instruction_text + input_text\n",
    "\n",
    "print(format_input(test_data[0]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cf2c2ba0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from litgpt import LLM\n",
    "\n",
    "llm = LLM.load(\"microsoft/phi-2\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3b43aec-a9c4-45ee-83a3-146b2413417e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "for i in tqdm(range(len(test_data))):\n",
    "    response = llm.generate(test_data[i])\n",
    "    test_data[i][\"base_model\"] = response"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6cc7b91-6c2e-4a09-98b6-0e3cd1192aec",
   "metadata": {},
   "source": [
    "- Using this utility function, generate and save all the test set responses generated by the model and add them to the `test_set`\n",
    "- For example, if `test_data[0]` entry is as follows before:\n",
    "    \n",
    "```\n",
    "{'instruction': 'Rewrite the sentence using a simile.',\n",
    " 'input': 'The car is very fast.',\n",
    " 'output': 'The car is as fast as lightning.'}\n",
    "```\n",
    "\n",
    "- Modify the `test_data` entry so that it contains the model response:\n",
    "    \n",
    "```\n",
    "{'instruction': 'Rewrite the sentence using a simile.',\n",
    " 'input': 'The car is very fast.',\n",
    " 'output': 'The car is as fast as lightning.',\n",
    " 'base_model': 'The car is as fast as a cheetah sprinting across the savannah.'\n",
    "}\n",
    "```\n",
    "\n",
    "- Do this for all test set entries, and then save the modified `test_data` dictionary as `test_base_model.json`\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "88b92e98",
   "metadata": {},
   "outputs": [],
   "source": [
    "test_data[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c45c1d2-3295-434f-b30c-3cd4cac89367",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# Exercise 2: Generate and save the test set model responses of the finetuned model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d226d857-68f7-43f3-90c8-4534747b158a",
   "metadata": {},
   "source": [
    "- Repeat the steps from the previous exercise but this time collect the responses of the finetuned model\n",
    "- Save the resulting `test_data` dictionary as `test_base_and_finetuned_model.json`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fbdeb748-e3a9-4671-ae9b-046e6f298044",
   "metadata": {},
   "source": [
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "<br>\n",
    "\n",
    "# Solution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5d22c20a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from litgpt import LLM\n",
    "\n",
    "del llm\n",
    "llm2 = LLM.load(\"out/finetune/lora/final/\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c7300afe",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "for i in tqdm(range(len(test_data))):\n",
    "    response = llm2.generate(test_data[i])\n",
    "    test_data[i][\"finetuned_model\"] = response"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "V100",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
